Introduction

The main objective of this paper is to map and visualize COVID-19 cases and deaths by county, for the United States as a whole and for New York State, based on data collected as of June 8, 2020.

Loading Data

First, we load the county spatial data with the readOGR function and the COVID-19 cases CSV file dated June 8. This analysis considers only the latest available date, so the data are filtered to 2020-06-08. To merge the case data with the counties, we prepared a matching key on each side: fips (COVID data) is converted to character and left-padded with a zero when it has only four digits, and CNTYIDFP (county data) is built by concatenating STATEFP and COUNTYFP.

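library(rgdal)    # readOGR(); also attaches sp, which provides merge() for spatial objects
library(expss)    # apply_labels() and the tab_*() functions
library(leaflet)  # leaflet(), addPolygons(), colorQuantile()

setwd("~/Downloads/HU/2020Summer/DataViz512/5_0606V/gisData")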

counties = readOGR(dsn=".",layer="cb_2016_us_county_500k")
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/akhilasaineni/Downloads/HU/2020Summer/DataViz512/5_0606V/gisData", layer: "cb_2016_us_county_500k"
## with 3233 features
## It has 9 fields
## Integer64 fields read as strings:  ALAND AWATER
covid = read.csv("us-counties_0608.csv")
covid = covid[covid$date == '2020-06-08',]
covid$fips = as.character(covid$fips)
covid$fips = ifelse(nchar(covid$fips) == 4, paste0("0",covid$fips), covid$fips)
summary(covid)
##          date             county          state          fips          
##  2020-06-08:3011   Washington:  30   Texas   : 235   Length:3011       
##  2020-01-21:   0   Jefferson :  26   Georgia : 160   Class :character  
##  2020-01-22:   0   Unknown   :  26   Virginia: 130   Mode  :character  
##  2020-01-23:   0   Franklin  :  25   Kentucky: 119                     
##  2020-01-24:   0   Jackson   :  23   Missouri: 110                     
##  2020-01-25:   0   Lincoln   :  23   Illinois: 101                     
##  (Other)   :   0   (Other)   :2858   (Other) :2156                     
##      cases              deaths        
##  Min.   :     0.0   Min.   :    0.00  
##  1st Qu.:    13.0   1st Qu.:    0.00  
##  Median :    53.0   Median :    1.00  
##  Mean   :   654.5   Mean   :   36.85  
##  3rd Qu.:   234.5   3rd Qu.:    8.00  
##  Max.   :212122.0   Max.   :21356.00  
## 
head(covid)
##              date  county   state  fips cases deaths
## 215617 2020-06-08 Autauga Alabama 01001   273      5
## 215618 2020-06-08 Baldwin Alabama 01003   335      9
## 215619 2020-06-08 Barbour Alabama 01005   198      1
## 215620 2020-06-08    Bibb Alabama 01007    82      1
## 215621 2020-06-08  Blount Alabama 01009    75      1
## 215622 2020-06-08 Bullock Alabama 01011   240      9
counties$CNTYIDFP<-paste0(counties$STATEFP,counties$COUNTYFP)
merge<-merge(counties, covid, by.x ="CNTYIDFP", by.y ="fips")
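
The key preparation above can also be written as a small helper and checked before merging. The following is a minimal sketch on made-up values: pad_fips, fips_raw and county_ids are hypothetical names introduced here for illustration, not objects from the analysis. It assumes fips is read as an integer column and that some rows (for example the "Unknown" county entries visible in the summary) may carry no FIPS code.

```r
# Toy stand-ins for covid$fips and counties$CNTYIDFP (hypothetical values)
fips_raw   <- c(1001L, 36061L, NA)
county_ids <- c("01001", "36061", "36047")

# Left-pad FIPS codes to five digits; keep missing codes as NA
pad_fips <- function(x) {
  out <- sprintf("%05d", as.integer(x))
  out[is.na(x)] <- NA_character_
  out
}

fips <- pad_fips(fips_raw)

# County IDs with no COVID row: these polygons would get NA cases after the merge
unmatched <- setdiff(county_ids, fips)

print(fips)
print(unmatched)
```

Listing the unmatched IDs before merging makes it easier to tell whether a blank county on the map reflects missing data or a key mismatch.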

Methods

To conduct this analysis, we used expss functions to create baseline tabulations of the confirmed COVID-19 cases and deaths reported for each state. New York State has the highest number of both cases and deaths, 383,591 and 30,239 respectively.

data = apply_labels(covid,
                    cases="Total COVID 19 Cases",
                    deaths="Total COVID 19 Deaths", 
                    county="County Name",
                    state="State Name"
)

data %>%
  tab_cells(cases, deaths) %>% 
  tab_cols(total(label = "#Total| |"), state) %>%
  tab_stat_fun(TotalCases=w_sum, method=list) %>%
  tab_pivot() %>%
  tab_transpose()
                            Total COVID 19 Cases   Total COVID 19 Deaths
#Total                                   1970613                  110966
State Name
  Alabama                                  20925                     718
  Alaska                                     607                       8
  Arizona                                  27761                    1052
  Arkansas                                  9740                     155
  California                              134287                    4679
  Colorado                                 28169                    1543
  Connecticut                              44092                    4084
  Delaware                                  9972                     398
  District of Columbia                      9389                     491
  Florida                                  64896                    2711
  Georgia                                  49995                    2176
  Guam                                      1149                       6
  Hawaii                                     664                      17
  Idaho                                     3197                      83
  Illinois                                128819                    5964
  Indiana                                  38553                    2316
  Iowa                                     22111                     623
  Kansas                                   10724                     237
  Kentucky                                 11701                     487
  Louisiana                                43163                    2944
  Maine                                     2588                      99
  Maryland                                 59024                    2776
  Massachusetts                           103626                    7353
  Michigan                                 64911                    5916
  Minnesota                                28235                    1208
  Mississippi                              17768                     837
  Missouri                                 15159                     832
  Montana                                    548                      18
  Nebraska                                 15752                     196
  Nevada                                    9815                     442
  New Hampshire                             5079                     286
  New Jersey                              164497                   12214
  New Mexico                                9062                     400
  New York                                383591                   30239
  North Carolina                           36581                    1041
  North Dakota                              2883                      75
  Northern Mariana Islands                    28                       2
  Ohio                                     38837                    2404
  Oklahoma                                  7206                     348
  Oregon                                    4925                     164
  Pennsylvania                             80432                    6007
  Puerto Rico                               5046                     142
  Rhode Island                             15642                     799
  South Carolina                           14800                     557
  South Dakota                              5471                      65
  Tennessee                                27217                     417
  Texas                                    77326                    1856
  Utah                                     12378                     124
  Vermont                                   1075                      55
  Virgin Islands                              71                       6
  Virginia                                 51251                    1477
  Washington                               25593                    1168
  West Virginia                             2161                      84
  Wisconsin                                21161                     650
  Wyoming                                    960                      17
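
As a cross-check on the expss tabulation, the same per-state totals can be computed with base R. The following is a minimal sketch on a small made-up data frame: covid_demo and its values are hypothetical stand-ins for covid, not the real data.

```r
# Hypothetical toy data with the same column names as covid
covid_demo <- data.frame(
  state  = c("New York", "New York", "Texas"),
  cases  = c(10, 5, 7),
  deaths = c(1, 2, 0),
  stringsAsFactors = FALSE
)

# Sum cases and deaths within each state
# (the formula interface drops rows with NA in these columns by default)
by_state <- aggregate(cbind(cases, deaths) ~ state, data = covid_demo, FUN = sum)

# State with the largest case total
top_state <- by_state$state[which.max(by_state$cases)]

print(by_state)
print(top_state)
```

Applied to the real covid data frame, a summary like this should agree with the state rows of the table above if both are computed from the same filtered data.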

Results

Total Covid Cases in United States by County

We created a leaflet map of the total number of confirmed COVID-19 cases in the United States by county, shaded with a nine-class quantile palette of reds. Hovering over a county shows a label with the county and state names, followed by the case and death counts prefixed with C and D respectively.

pal = colorQuantile("Reds", covid$cases, n = 9)

leaflet(merge) %>% setView(-98,39, zoom=4) %>%
  addPolygons(weight=.10, color="blue",fillOpacity = .2, fillColor = ~pal(cases),
              label= paste(merge$NAME, ",", merge$state, ":", "C", merge$cases, "D", merge$deaths))
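
A quantile palette colors each county by the quantile bin its value falls into, so that each color covers roughly the same number of counties. The sketch below illustrates that binning idea with base R only. It is not leaflet's internal implementation, and cases_demo and the choice of three bins are hypothetical, chosen to keep the example small.

```r
# Hypothetical, heavily skewed case counts (similar in shape to county data)
cases_demo <- c(0, 3, 13, 53, 120, 234, 900, 5000, 212122)

# Break points at the 0%, 33%, 67% and 100% quantiles
breaks <- quantile(cases_demo, probs = seq(0, 1, length.out = 4), na.rm = TRUE)

# Bin index (1 = lowest third, 3 = highest third) for each value
bins <- cut(cases_demo, breaks = breaks, include.lowest = TRUE, labels = FALSE)

print(breaks)
print(bins)
```

Because the bins are defined by rank rather than by value, a single extreme county does not compress every other county into the lightest color, which is one reason a quantile scale suits counts this skewed.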

Total Covid Deaths in New York State by County

We created a leaflet map of the total number of COVID-19 deaths in New York State by county, shaded with a five-class quantile palette of blues. The hover label shows the county name and its death count.

covid_newyork = subset(covid, state == "New York")

merge_ny<-merge(counties, covid_newyork, by.x ="CNTYIDFP", by.y ="fips")

pal_ny <- colorQuantile("Blues", domain = covid_newyork$deaths, n=5)

leaflet(merge_ny) %>% setView(-74,43, zoom=6) %>%
  addPolygons(weight=.30,color="green",fillOpacity = .2, fillColor = ~pal_ny(deaths),
              label= paste(merge_ny$NAME, "COVID 19 Deaths:", merge_ny$deaths ))
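
The hover labels in both maps are assembled with paste(), which puts a space around every piece and prints NA for counties without a matching COVID row. One possible alternative is sketched below. make_label is a hypothetical helper, not part of the original analysis, and the "n/a" placeholder and thousands separators are illustrative choices.

```r
# Format a count with thousands separators, or "n/a" when it is missing
fmt_count <- function(x) {
  ifelse(is.na(x), "n/a",
         format(x, big.mark = ",", trim = TRUE, scientific = FALSE))
}

# Build a label such as "Autauga, Alabama: C 273 D 5"
make_label <- function(name, state, cases, deaths) {
  sprintf("%s, %s: C %s D %s", name, state, fmt_count(cases), fmt_count(deaths))
}

print(make_label("Autauga", "Alabama", 273, 5))
print(make_label("Example", "State", 212122, NA))
```

A vector of such labels could be passed to the label argument of addPolygons in place of the paste() call.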

Bibliography

expss: Tables with Labels in R (2019-07-06). Retrieved from https://gdemin.github.io/expss/

tables: Functions for custom tables construction. Retrieved from https://rdrr.io/cran/expss/man/tables.html

Leaflet for R - Colors. Retrieved from https://rstudio.github.io/leaflet/colors.html

COVID-19 county data. Retrieved from https://www.nytimes.com/article/coronavirus-county-data-us.html